RSS Conference 2023
A combination of theft, anecdotes and commentary.
Replicable: if the experiment were repeated by an independent investigator, they would get slightly different data, but would the substantive conclusions be the same?
In the specific sense, this is the core worry for a statistician!
Also used more generally: are results stable to perturbations in population / study design / modelling / analysis?
Only real test is to try it. Control risk with shadow and parallel deployment.
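Short of a true repeat, a rough check on stability to perturbations in the data is to resample and see how often the conclusion survives. A minimal sketch, assuming a simple difference-in-means analysis; the function names and the numbers are made up for illustration and are not from the talk.

```python
import random
import statistics


def mean_difference(treated, control):
    """The analysis under study: difference in group means."""
    return statistics.mean(treated) - statistics.mean(control)


def bootstrap_sign_stability(treated, control, n_resamples=1000, seed=2023):
    """Fraction of bootstrap resamples whose estimate keeps the original sign."""
    rng = random.Random(seed)  # fixed seed so the check itself is reproducible
    original = mean_difference(treated, control)
    same_sign = 0
    for _ in range(n_resamples):
        # Perturb the data: resample each group with replacement.
        t = rng.choices(treated, k=len(treated))
        c = rng.choices(control, k=len(control))
        if mean_difference(t, c) * original > 0:
            same_sign += 1
    return same_sign / n_resamples


# Hypothetical measurements, for illustration only.
treated = [5.1, 6.3, 5.8, 6.0, 5.5, 6.4, 5.9, 6.1]
control = [4.8, 5.0, 5.2, 4.9, 5.1, 4.7, 5.3, 5.0]

stability = bootstrap_sign_stability(treated, control)
print(f"Conclusion kept its sign in {stability:.1%} of resamples")
```

This only probes sampling variability within one dataset. It says nothing about changes in population or study design, which is why the only real test is still to try it.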
Reproducible: given the original raw data and code, can you get all of the results again?
Reproducible != Correct
“Code available on request” is the new “Data available on request”
Reproducible data analysis requires effort, time and skill.
A study is reproducible if you can take the original data and the computer code used to analyze the data and recreate all of the numerical findings from the study.
Broman et al. (2017) “Recommendations to Funding Agencies for Supporting Reproducible Research”
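The definition above can be turned into an automated check: rerun the analysis from the raw data and compare a fingerprint of the numerical findings. A minimal sketch, assuming the whole analysis fits in one function; `run_analysis`, `fingerprint` and the data are hypothetical stand-ins, not anything from the talk.

```python
import hashlib
import json
import random


def run_analysis(raw_data, seed):
    """Toy analysis: sample mean with a bootstrap standard error."""
    rng = random.Random(seed)  # all randomness flows from one explicit seed
    n = len(raw_data)
    boot_means = []
    for _ in range(200):
        sample = rng.choices(raw_data, k=n)
        boot_means.append(sum(sample) / n)
    centre = sum(boot_means) / len(boot_means)
    variance = sum((m - centre) ** 2 for m in boot_means) / (len(boot_means) - 1)
    return {
        "n": n,
        "mean": round(sum(raw_data) / n, 6),
        "bootstrap_se": round(variance ** 0.5, 6),
    }


def fingerprint(results):
    """Stable hash of the reported numbers (keys sorted for determinism)."""
    payload = json.dumps(results, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Hypothetical raw data, for illustration only.
raw_data = [2.3, 3.1, 2.8, 3.6, 2.9, 3.3]

first = fingerprint(run_analysis(raw_data, seed=1))
second = fingerprint(run_analysis(raw_data, seed=1))
reproducible = first == second
print("Reproducible:", reproducible)
```

Matching fingerprints show that the numbers can be recreated, not that they are right: a bug in `run_analysis` would reproduce perfectly.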
Reproducibility is good for science and good for the individual.
Publicly sharing code as well as data: importance of documentation & testing.
Teaching materials made with literate programming or WYSIWYG:
Steal from reproducible reporting to give each student their own dataset to analyse and produce individualised mark schemes.
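One way the individualised-dataset idea might look in practice: derive a seed from each student ID, generate their data from it, and compute the expected answers from the same data. A minimal sketch; the ID format, assignment label and summary statistics are assumptions for illustration.

```python
import hashlib
import random
import statistics


def student_seed(student_id, assignment="hw1"):
    """Stable seed from the student ID.

    Uses hashlib rather than the built-in hash(), which is salted per
    process for strings and so would not reproduce across sessions.
    """
    digest = hashlib.sha256(f"{assignment}:{student_id}".encode("utf-8")).hexdigest()
    return int(digest[:16], 16)


def make_dataset(student_id, n=30):
    """Simulate this student's personal dataset."""
    rng = random.Random(student_seed(student_id))
    return [round(rng.gauss(50, 10), 1) for _ in range(n)]


def mark_scheme(student_id):
    """Expected answers for this student, computed from their own data."""
    data = make_dataset(student_id)
    return {
        "mean": round(statistics.mean(data), 2),
        "sd": round(statistics.stdev(data), 2),
    }


# Hypothetical student IDs.
for sid in ["s001", "s002"]:
    print(sid, mark_scheme(sid))
```

Because the dataset is a pure function of the ID, nothing needs to be stored: the marker regenerates the same data and answers on demand.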
Training current / future colleagues.
Good programmes cover reproducible modelling
Notebooks give a controlled environment, but there is more need for scripting and literate reporting.
Can also take a broader view of reproducibility: monotonous jobs you have to do repeatedly that take a long time to do.
From my experience:
Getting Your Work to Work - RSS Conference 2023 - Zak Varty